Exploratory Data Analysis (EDA)
Exploratory Data Analysis (EDA) based on percentage
## # A tibble: 52 × 10
## NAME Total_people Total With Disabilit…¹ Hearing `Vision difficulty`
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 Alabama 4957633 808071 208028 152798
## 2 Alaska 702154 92390 33397 15748
## 3 Arizona 7174053 972252 298849 180792
## 4 Arkansas 2974701 517051 142133 105624
## 5 California 38724294 4324355 1140131 844049
## 6 Colorado 5715497 640346 211803 120570
## 7 Connecticut 3557526 427014 113490 78078
## 8 Delaware 987964 130551 37933 25335
## 9 District of … 659979 76754 14429 14569
## 10 Florida 21465883 2906367 812248 555361
## # ℹ 42 more rows
## # ℹ abbreviated name: ¹`Total With Disabilities`
## # ℹ 5 more variables: cognative <dbl>, `ambulatory difficulty` <dbl>,
## # `Self-care difficulty` <dbl>, `Independent living difficulty` <dbl>,
## # `No Disability` <dbl>
##
## Call:
## lm(formula = `ambulatory difficulty` ~ Hearing + `Vision difficulty` +
## cognative + `Self-care difficulty` + `Independent living difficulty`,
## data = df6_wide)
##
## Residuals:
## Min 1Q Median 3Q Max
## -56621 -11941 -2766 17524 73868
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -8200.5856 5646.4977 -1.452 0.153197
## Hearing 0.6796 0.1793 3.791 0.000436 ***
## `Vision difficulty` 1.0070 0.1644 6.126 1.87e-07 ***
## cognative -0.5426 0.2291 -2.369 0.022116 *
## `Self-care difficulty` -1.9464 0.4434 -4.390 6.59e-05 ***
## `Independent living difficulty` 1.9228 0.3144 6.115 1.95e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 24940 on 46 degrees of freedom
## Multiple R-squared: 0.9968, Adjusted R-squared: 0.9965
## F-statistic: 2909 on 5 and 46 DF, p-value: < 2.2e-16
##
## Call:
## lm(formula = `ambulatory difficulty` ~ Hearing + `Vision difficulty` +
## cognative + `Self-care difficulty` + `Independent living difficulty`,
## data = df6_wide)
##
## Coefficients:
## (Intercept) Hearing
## -8200.5856 0.6796
## `Vision difficulty` cognative
## 1.0070 -0.5426
## `Self-care difficulty` `Independent living difficulty`
## -1.9464 1.9228
This project investigates how socio-economic factors affect employment rates across U.S. states. Using ACS 2021 data, we explore relationships between employment and variables such as education, citizenship, and housing. Key findings include significant correlations between these variables and employment.
We analyze socio-economic factors that influence employment in the U.S. Understanding how factors such as education, age, and housing relate to employment rates can help policymakers and researchers.
We imported the data from the American Community Survey (ACS). The
project has six child Rmd files, one for each ACS 2021 data set
analyzed: Employment, Education, Citizenship, Age, Housing, and
Disabilities.
We first explored each data set in detail and then combined each one
with the employment data to see what relationships emerged. We found
that several of the data sets relate directly to employment. Our
concluding results are in the Final Report, where they support our
conclusions.
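The combine-then-compare step described above can be sketched roughly as follows. This is a minimal base R illustration, not the project's actual code: the state names, column names (`Employed`, `Total`, `Bachelors`, `Total_edu`), and every number are made up for the example and are not ACS figures.

```r
# Hypothetical toy data; values are invented for illustration, not ACS figures.
employment <- data.frame(
  NAME     = c("State A", "State B", "State C", "State D"),
  Employed = c(520, 310, 880, 150),
  Total    = c(1000, 600, 1500, 300)
)
education <- data.frame(
  NAME      = c("State A", "State B", "State C", "State D"),
  Bachelors = c(210, 100, 400, 40),
  Total_edu = c(700, 420, 1100, 210)
)

# Join the two data sets on the shared state name.
combined <- merge(employment, education, by = "NAME")

# Convert raw counts to shares so that large states do not dominate.
combined$employment_rate <- combined$Employed / combined$Total
combined$bachelors_share <- combined$Bachelors / combined$Total_edu

# Pearson correlation between the two shares.
r <- cor(combined$employment_rate, combined$bachelors_share)

print(combined[, c("NAME", "employment_rate", "bachelors_share")])
print(r)
```

Working with shares rather than counts matters here: raw state-level counts all scale with population, so they tend to correlate strongly with each other regardless of any underlying relationship.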
The data sets below come from data.census.gov (United States Census Bureau). We selected them from the ACS 2021 release; each covers all states in the United States.
Employment - K202301
| Variable | Description |
|---|---|
| Total | Total Employment Data |
| In Labor Force | Total People in Labor Force |
| Civilian labor force: | Total People in Civilian Labor Force |
| Employed | Total People Employed |
| Unemployed | Total People Unemployed |
| In Armed Forces | Total People in Armed Forces |
| Not in labor force | Total People not in Labor Force |
Education - K201501
| Variable | Description |
|---|---|
| | Total students in the education survey |
| | Number of students who have completed 9th grade |
| | Number of students who have completed 9th grade to 12th grade but no diploma |
Citizenship - K200501
| Variable | Description |
|---|---|
Age - K200104
| Variable | Description |
|---|---|
Housing - K202502
| Variable | Description |
|---|---|
Disabilities - K201803
| Variable | Description |
|---|---|